29 research outputs found

    Center-based Clustering under Perturbation Stability

    Clustering under most popular objective functions is NP-hard, even to approximate well, and so unlikely to be efficiently solvable in the worst case. Recently, Bilu and Linial \cite{Bilu09} suggested an approach aimed at bypassing this computational barrier by using properties of instances one might hope to hold in practice. In particular, they argue that instances in practice should be stable to small perturbations in the metric space and give an efficient algorithm for clustering instances of the Max-Cut problem that are stable to perturbations of size O(n^{1/2}). In addition, they conjecture that instances stable to as little as O(1) perturbations should be solvable in polynomial time. In this paper we prove that this conjecture is true for any center-based clustering objective (such as k-median, k-means, and k-center). Specifically, we show we can efficiently find the optimal clustering assuming only stability to factor-3 perturbations of the underlying metric in spaces without Steiner points, and stability to factor 2+\sqrt{3} perturbations for general metrics. In particular, we show for such instances that the popular Single-Linkage algorithm combined with dynamic programming will find the optimal clustering. We also present NP-hardness results under a weaker but related condition.
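
For readers unfamiliar with the stability notion, one common way to formalize "stable to factor-\alpha perturbations" is sketched below. The symbols S, d, d', \Phi, and \alpha are notation chosen here for illustration and do not appear in the abstract; the paper's own definition may differ in detail.

```latex
% A hedged sketch of \alpha-perturbation stability (notation assumed, not from the abstract).
% (S, d): a finite point set with a distance function; \Phi: a center-based objective.
\[
  (S,d) \text{ is } \alpha\text{-stable for } \Phi
  \iff
  \text{for every } d' \text{ with }
  d(p,q) \le d'(p,q) \le \alpha\, d(p,q) \ \ \forall p,q \in S,
\]
\[
  \text{the optimal clustering for } \Phi \text{ under } d'
  \text{ is unique and equals the optimal clustering under } d .
\]
```

Under this reading, the abstract's results correspond to \alpha = 3 (no Steiner points) and \alpha = 2+\sqrt{3} (general metrics).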

    Differentially Private Data Analysis of Social Networks via Restricted Sensitivity

    We introduce the notion of restricted sensitivity as an alternative to global and smooth sensitivity to improve accuracy in differentially private data analysis. The definition of restricted sensitivity is similar to that of global sensitivity except that instead of quantifying over all possible datasets, we take advantage of any beliefs about the dataset that a querier may have, to quantify over a restricted class of datasets. Specifically, given a query f and a hypothesis H about the structure of a dataset D, we show generically how to transform f into a new query f_H whose global sensitivity (over all datasets including those that do not satisfy H) matches the restricted sensitivity of the query f. Moreover, if the belief of the querier is correct (i.e., D is in H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be inaccurate. We demonstrate the usefulness of this notion by considering the task of answering queries regarding social networks, which we model as a combination of a graph and a labeling of its vertices. In particular, while our generic procedure is computationally inefficient, for the specific definition of H as graphs of bounded degree, we exhibit efficient ways of constructing f_H using different projection-based techniques. We then analyze two important query classes: subgraph counting queries (e.g., number of triangles) and local profile queries (e.g., number of people who know a spy and a computer scientist who know each other). We demonstrate that the restricted sensitivity of such queries can be significantly lower than their smooth sensitivity. Thus, using restricted sensitivity we can maintain privacy whether or not D is in H, while providing more accurate results in the event that H holds true.
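
The contrast between the two sensitivity notions can be written out as a pair of formulas. This is a sketch that follows the abstract's wording ("quantify over a restricted class of datasets"); the distance function \mathrm{dist} between datasets and the symbols GS and RS are notation assumed here, not taken from the abstract.

```latex
% Global sensitivity: worst case over ALL pairs of datasets.
\[
  GS_f \;=\; \max_{D_1 \ne D_2} \frac{\lvert f(D_1) - f(D_2) \rvert}{\mathrm{dist}(D_1, D_2)}
\]
% Restricted sensitivity: the same ratio, but only over datasets satisfying the hypothesis H.
\[
  RS_f(H) \;=\; \max_{\substack{D_1, D_2 \in H \\ D_1 \ne D_2}} \frac{\lvert f(D_1) - f(D_2) \rvert}{\mathrm{dist}(D_1, D_2)}
\]
% Since the second maximum ranges over a subset of the pairs in the first,
% RS_f(H) \le GS_f always holds.
```

The transformed query f_H described above is then one whose global sensitivity matches RS_f(H) while agreeing with f on every dataset in H.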

    Differentially Private Approximations of a Convex Hull in Low Dimensions

    We give the first differentially private algorithms that estimate a variety of geometric features of points in Euclidean space, such as diameter, width, volume of convex hull, min-bounding box, min-enclosing ball, etc. Our work relies heavily on the notion of Tukey depth. Instead of (non-privately) approximating the convex hull of the given set of points P, our algorithms approximate the geometric features of D_P(κ), the κ-Tukey region induced by P (all points of Tukey depth κ or greater). Moreover, our approximations are all bi-criteria: for any geometric feature μ our (α,Δ)-approximation is a value "sandwiched" between (1-α)μ(D_P(κ)) and (1+α)μ(D_P(κ-Δ)). Our work is aimed at producing an (α,Δ)-kernel of D_P(κ), namely a set S such that (after a shift) it holds that (1-α)D_P(κ) ⊂ CH(S) ⊂ (1+α)D_P(κ-Δ). We show that an analogous notion of a bi-criteria approximation of a directional kernel, as originally proposed by [Pankaj K. Agarwal et al., 2004], fails to give a kernel, and so we resort to subtler notions of approximations of projections that do yield a kernel. First, we give differentially private algorithms that find (α,Δ)-kernels for a "fat" Tukey region. Then, based on a private approximation of the min-bounding box, we find a transformation that does turn D_P(κ) into a "fat" region but only if its volume is proportional to the volume of D_P(κ-Δ). Lastly, we give a novel private algorithm that finds a depth parameter κ for which the volume of D_P(κ) is comparable to the volume of D_P(κ-Δ). We hope our work leads to the further study of the intersection of differential privacy and computational geometry.
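
To make Tukey depth and the Tukey region concrete, here is a minimal, non-private sketch restricted to one dimension, where a point's depth is simply the smaller of the number of data points on either side of it (counting ties). The function names `tukey_depth_1d` and `tukey_region_1d` and the parameter name `kappa` are hypothetical; nothing here reproduces the paper's private algorithms.

```python
from bisect import bisect_left, bisect_right


def tukey_depth_1d(points, x):
    """Tukey depth of x w.r.t. a 1-D point set: the smaller of
    #{p <= x} and #{p >= x}. This is the 1-D special case only."""
    s = sorted(points)
    at_most = bisect_right(s, x)            # number of points <= x
    at_least = len(s) - bisect_left(s, x)   # number of points >= x
    return min(at_most, at_least)


def tukey_region_1d(points, kappa):
    """Interval of all x with Tukey depth >= kappa, or None if empty.
    In 1-D this is [kappa-th smallest point, kappa-th largest point]."""
    s = sorted(points)
    n = len(s)
    if kappa < 1 or kappa > n:
        return None
    lo, hi = s[kappa - 1], s[n - kappa]
    return (lo, hi) if lo <= hi else None


if __name__ == "__main__":
    P = [1, 2, 3, 4, 5]
    print(tukey_depth_1d(P, 3))    # the median is the deepest point
    print(tukey_region_1d(P, 2))   # shallower threshold, wider region
    print(tukey_region_1d(P, 3))   # deeper threshold, narrower region
```

Lowering the depth threshold can only enlarge the region, which is why a feature such as the diameter of the deeper region is bounded by that of the shallower one; the bi-criteria guarantee in the abstract sandwiches the output between two such regions.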

    Graph coloring with no large monochromatic components

    For a graph G and an integer t we let mcc_t(G) be the smallest m such that there exists a coloring of the vertices of G by t colors with no monochromatic connected subgraph having more than m vertices. Let F be any nontrivial minor-closed family of graphs. We show that mcc_2(G) = O(n^{2/3}) for any n-vertex graph G \in F. This bound is asymptotically optimal and it is attained for planar graphs. More generally, for every such F and every fixed t we show that mcc_t(G) = O(n^{2/(t+1)}). On the other hand we have examples of graphs G with no K_{t+3} minor and with mcc_t(G) = \Omega(n^{2/(2t-1)}). It is also interesting to consider graphs of bounded degrees. Haxell, Szabo, and Tardos proved mcc_2(G) \leq 20000 for every graph G of maximum degree 5. We show that there are n-vertex 7-regular graphs G with mcc_2(G) = \Omega(n), and more sharply, for every \epsilon > 0 there exists c_\epsilon > 0 and n-vertex graphs of maximum degree 7, average degree at most 6+\epsilon for all subgraphs, and with mcc_2(G) \geq c_\epsilon n. For 6-regular graphs it is known only that the maximum order of magnitude of mcc_2 is between \sqrt{n} and n. We also offer a Ramsey-theoretic perspective of the quantity mcc_t(G). Comment: 13 pages, 2 figures
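
The definition of mcc_t(G) can be checked directly on tiny graphs. The sketch below measures the largest monochromatic connected component of a given coloring and then minimizes over all t^n colorings by brute force. The function names and the adjacency-list input format are assumptions made for illustration; the exhaustive search is only feasible for very small graphs and is unrelated to the proof techniques of the paper.

```python
from collections import deque
from itertools import product


def largest_mono_component(adj, coloring):
    """Size of the largest connected set of same-colored vertices.
    adj: list of neighbor lists; coloring: sequence of colors per vertex."""
    n = len(adj)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        size = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                # Only walk edges whose endpoints share a color.
                if not seen[w] and coloring[w] == coloring[v]:
                    seen[w] = True
                    queue.append(w)
        best = max(best, size)
    return best


def mcc_bruteforce(adj, t):
    """mcc_t(G) by exhaustive search over all t**n vertex colorings."""
    n = len(adj)
    return min(
        largest_mono_component(adj, coloring)
        for coloring in product(range(t), repeat=n)
    )


if __name__ == "__main__":
    triangle = [[1, 2], [0, 2], [0, 1]]
    cycle4 = [[1, 3], [0, 2], [1, 3], [0, 2]]
    print(mcc_bruteforce(triangle, 2))  # two colors cannot separate all three vertices
    print(mcc_bruteforce(cycle4, 2))    # a proper 2-coloring exists
```

A graph has mcc_t(G) = 1 exactly when it is properly t-colorable, so the quantity can be read as a relaxation of proper coloring.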